Skip to content

[wasm-split] Do not split out the start function#8711

Merged
kripken merged 12 commits into
WebAssembly:mainfrom
kripken:split.start
May 20, 2026
Merged

[wasm-split] Do not split out the start function#8711
kripken merged 12 commits into
WebAssembly:mainfrom
kripken:split.start

Conversation

@kripken
Copy link
Copy Markdown
Member

@kripken kripken commented May 15, 2026

Like imports, automatically keep it in the primary module.

@kripken kripken requested a review from a team as a code owner May 15, 2026 18:29
@kripken kripken requested review from aheejin and removed request for a team May 15, 2026 18:29
Copy link
Copy Markdown
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Can you handle it in multiSplitModule too?

@kripken
Copy link
Copy Markdown
Member Author

kripken commented May 15, 2026

Ok, I think I managed to do that. The first commit runs into a crash where the table is not present yet. I fixed that in the last commit by reordering operations: shareImportableItems needs the other code to run first so it can see which tables etc. need to be used.

The change to the existing test seems valid to me - now it is using the more accurate table definition from the primary module (which is larger).

However, I'm not very familiar with this code, so PTAL carefully!

Comment thread src/ir/module-splitting.cpp Outdated
indirectReferencesToSecondaryFunctions();
indirectCallsToSecondaryFunctions();
exportImportCalledPrimaryFunctions();
shareImportableItems();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the order change?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I meant by this:

The first commit runs into a crash where the table is not present yet. I fixed that in the last commit by reordering operations: shareImportableItems needs the other code to run first so it can see which tables etc. need to be used.

That is the table needs to be created at the right time. I think...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why that would be related to this change 🤔

Copy link
Copy Markdown
Member

@aheejin aheejin May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't either... Can you give an example?
That order change was the gist of #8688, so without breaking a lot of new assumptions, it is not easy to flip...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For an example, run the multi-split test in this PR. (That is, take all of this PR but the part in question. The test crashes because it doesn't find the table. Reordering makes the table exist - but I have no high-level understanding of the flow here 😄 so maybe it is the wrong fix)

Copy link
Copy Markdown
Member

@aheejin aheejin May 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this was a separate bug. The fix is in #8728. After this, I think we can change this PR so that this filters the start functions only in wasm-split.cpp and I don't think we need to touch module-splitting.cpp. See #8711 (comment)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment thread src/ir/module-splitting.cpp Outdated
Comment on lines +435 to +437
// The start function must always be kept in the primary module.
if (func->imported() || !configSecondaryFuncs.contains(func->name) ||
segmentReferrers.contains(func->name)) {
segmentReferrers.contains(func->name) || func->name == primary.start) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just do something like

if (primary.start)
  primaryFuncs.insert(primary.start)

at the start of the function, rather than in this loop?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need the if's else not to happen, though. That is, we need to add it in the primary and not in the secondary. If the loop body is not modified, it will add it to the second.

@kripken
Copy link
Copy Markdown
Member Author

kripken commented May 15, 2026

Looks like there are a bunch of tests that do not use the auto-updater, so I will need to update them manually. I did one in the last commit. I'll wait on the others to see if this fix is even the right one.

@tlively
Copy link
Copy Markdown
Member

tlively commented May 15, 2026

The fix looks right to me, but changing the order of splitting steps shouldn't be necessary.

@aheejin
Copy link
Copy Markdown
Member

aheejin commented May 15, 2026

Looks like there are a bunch of tests that do not use the auto-updater, so I will need to update them manually.

I didn't use update_lit_checks.py in some tests because it doesn't work very well on split modules. For example, it somehow omits some module elements. But other module elements are still shown. Not sure what the exact rule is, but I guess update_lit_checks.py wasn't designed to match resulting module elements to existing module elements well when there are split modules.

And AI is fairly good at updating those expectations 😀 I often give prompts like "Don't change whether each test uses update_lit_checks.py or not, and update expectations for tests in test/lit/wasm-split"

But I also don't see why this PR should change existing tests' expectations, who don't have start function.

Comment thread src/tools/wasm-split/wasm-split.cpp Outdated
Comment on lines +299 to +305
if (function->name == wasm.start) {
if (!options.quiet) {
std::cerr << "warning: cannot split out start function " << func
<< "\n";
}
continue;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you do the same thing in multiSplitModule after this line?

assert(currFuncs);

Then we don't need to do anything in module-splitting.cpp.

Also while you're at it you can add this there too, to be consistent with splitModule:

if (!function) {
if (!options.quiet) {
std::cerr << "warning: function " << func << " does not exist\n";
}
continue;
}
if (function->imported()) {
if (!options.quiet) {
std::cerr << "warning: cannot split out imported function " << func
<< "\n";
}
continue;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better, we can factor these three checks out to a function use it in both places

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

aheejin added a commit to aheejin/binaryen that referenced this pull request May 19, 2026
The way we currently share the active table with secondary modules
https://github.com/WebAssembly/binaryen/blob/2c2509b5bfe399df400b2185caa370f56e563d32/src/ir/module-splitting.cpp#L1122-L1156
has missed an important case in multi-split. This is one of the
unexpected consequences of WebAssembly#8688.

Here the `secondary` module in this loop is a secondary module whose
function have been put in the active table, mostly by `getTrampoline`
function. When we have a call from `$A` to `$B` and `$B` is split,
`getTrampoline` puts `$B` on the active table, which will be converted
to a placeholder in `setupTablePatcing`. So the code above share the
active table (and its base global, if exists) with `$B`'s module.

The problem is we didn't share the active global with `$A`'s module. In
a two-way split this is not a problem because `$A` is in the primary
module. But in multi-split, `$A`'s module itself is another secondary
module, and this module needs to access the active table because it
should call the placeholder for `$B`. This does it in
`indirectCallsToSecondaryFunctions`.

---

I wish we can do the active table sharing for callee modules in
`indirectCallsToSecondaryFunctions` too and get done with it, but we
still need to share it in `shareImportableItems` for secondary modules
that have functions to be replaced in the active table, because there is
a case of existing secondary function names in the active table. See
`KEEP-NONE` RUN line in the test below:
https://github.com/WebAssembly/binaryen/blob/main/test/lit/wasm-split/basic.wast

In this test, we reuse the existing table as the active table, which
already has `$foo` in its element section. And that `foo` will be split.
This won't be converted to trampolines because
https://github.com/WebAssembly/binaryen/blob/5fcc1af12ae19032339ee5e56889e08b6912e8cb/src/ir/module-splitting.cpp#L947-L958
and because there is no trampoline, this doesn't get to have a call
instruction from primary->secondary, so we can't handle it in
`indirectCallsToSecondaryFunctions`.

---

This fixes the error discussed in
WebAssembly#8711 (comment).
aheejin added a commit to aheejin/binaryen that referenced this pull request May 19, 2026
The way we currently share the active table with secondary modules
https://github.com/WebAssembly/binaryen/blob/2c2509b5bfe399df400b2185caa370f56e563d32/src/ir/module-splitting.cpp#L1122-L1156
has missed an important case in multi-split. This is one of the
unexpected consequences of WebAssembly#8688.

Here the `secondary` module in this loop is a secondary module whose
functions have been put in the active table, mostly by `getTrampoline`
function. When we have a call from `$A` to `$B` and `$B` is split,
`getTrampoline` puts `$B` on the active table, which will be converted
to a placeholder in `setupTablePatcing`. So the code above share the
active table (and its base global, if exists) with `$B`'s module.

The problem is we didn't share the active global with `$A`'s module. In
a two-way split this is not a problem because `$A` is in the primary
module. But in multi-split, `$A`'s module itself is another secondary
module, and this module needs to access the active table because it
should call the placeholder for `$B`. This does it in
`indirectCallsToSecondaryFunctions`.

---

I wish we can do the active table sharing for callee modules in
`indirectCallsToSecondaryFunctions` too and get done with it, but we
still need to share it in `shareImportableItems` for secondary modules
that have functions to be replaced in the active table, because there is
a case of existing secondary function names in the active table. See
`KEEP-NONE` RUN line in the test below:
https://github.com/WebAssembly/binaryen/blob/main/test/lit/wasm-split/basic.wast

In this test, we reuse the existing table as the active table, which
already has `$foo` in its element section. And that `foo` will be split.
Because both `bar` and `foo` are split there is no cross-module call
between them to process in `indirectCallsToSecondaryFunctions`. This
won't be converted to trampolines because
https://github.com/WebAssembly/binaryen/blob/5fcc1af12ae19032339ee5e56889e08b6912e8cb/src/ir/module-splitting.cpp#L947-L958
and because there is no trampoline, this doesn't get to have a call
instruction from primary->secondary, but we still need to share the
active table with the secondary module, and we can't handle it in
`indirectCallsToSecondaryFunctions`

---

This fixes the error discussed in
WebAssembly#8711 (comment).
aheejin added a commit to aheejin/binaryen that referenced this pull request May 19, 2026
The way we currently share the active table with secondary modules
https://github.com/WebAssembly/binaryen/blob/2c2509b5bfe399df400b2185caa370f56e563d32/src/ir/module-splitting.cpp#L1122-L1156
has missed an important case in multi-split. This is one of the
unexpected consequences of WebAssembly#8688.

Here the `secondary` module in this loop is a secondary module whose
functions have been put in the active table, mostly by `getTrampoline`
function. When we have a call from `$A` to `$B` and `$B` is split,
`getTrampoline` puts `$B` on the active table, which will be converted
to a placeholder in `setupTablePatcing`. So the code above share the
active table (and its base global, if exists) with `$B`'s module.

The problem is we didn't share the active global with `$A`'s module. In
a two-way split this is not a problem because `$A` is in the primary
module. But in multi-split, `$A`'s module itself is another secondary
module, and this module needs to access the active table because it
should call the placeholder for `$B`. This does it in
`indirectCallsToSecondaryFunctions`.

---

I wish we can do the active table sharing for callee modules in
`indirectCallsToSecondaryFunctions` too and get done with it, but we
still need to share it in `shareImportableItems` for secondary modules
that have functions to be replaced in the active table, because there is
a case of existing secondary function names in the active table. See
`KEEP-NONE` RUN line in the test below:
https://github.com/WebAssembly/binaryen/blob/main/test/lit/wasm-split/basic.wast

In this test, we reuse the existing table as the active table, which
already has `$foo` in its element section. And that `foo` will be split.
Because both `bar` and `foo` are split there is no cross-module call
between them to process in `indirectCallsToSecondaryFunctions`. This
won't be converted to trampolines because
https://github.com/WebAssembly/binaryen/blob/5fcc1af12ae19032339ee5e56889e08b6912e8cb/src/ir/module-splitting.cpp#L947-L958
and because there is no trampoline, this doesn't get to have a call
instruction from primary->secondary, but we still need to share the
active table with the secondary module, and we can't handle it in
`indirectCallsToSecondaryFunctions`

---

This fixes the error discussed in
WebAssembly#8711 (comment).
aheejin added a commit that referenced this pull request May 20, 2026
The way we currently share the active table with secondary modules
https://github.com/WebAssembly/binaryen/blob/2c2509b5bfe399df400b2185caa370f56e563d32/src/ir/module-splitting.cpp#L1122-L1156
has missed an important case in multi-split. This is one of the
unexpected consequences of #8688.

Here the `secondary` module in this loop is a secondary module whose
functions have been put in the active table, mostly by `getTrampoline`
function. When we have a call from `$A` to `$B` and `$B` is split,
`getTrampoline` puts `$B` on the active table, which will be converted
to a placeholder in `setupTablePatcing`. So the code above share the
active table (and its base global, if exists) with `$B`'s module.

The problem is we didn't share the active global with `$A`'s module. In
a two-way split this is not a problem because `$A` is in the primary
module. But in multi-split, `$A`'s module itself is another secondary
module, and this module needs to access the active table because it
should call the placeholder for `$B`. This does it in
`indirectCallsToSecondaryFunctions`.

---

I wish we can do the active table sharing for callee modules in
`indirectCallsToSecondaryFunctions` too and get done with it, but we
still need to share it in `shareImportableItems` for secondary modules
that have functions to be replaced in the active table, because there is
a case of existing secondary function names in the active table. See
`KEEP-NONE` RUN line in the test below:

https://github.com/WebAssembly/binaryen/blob/main/test/lit/wasm-split/basic.wast

In this test, we reuse the existing table as the active table, which
already has `$foo` in its element section. And that `foo` will be split.
Because both `bar` and `foo` are split there is no cross-module call
between them to process in `indirectCallsToSecondaryFunctions`. This
won't be converted to trampolines because

https://github.com/WebAssembly/binaryen/blob/5fcc1af12ae19032339ee5e56889e08b6912e8cb/src/ir/module-splitting.cpp#L947-L958
and because there is no trampoline, this doesn't get to have a call
instruction from primary->secondary, but we still need to share the
active table with the secondary module, and we can't handle it in
`indirectCallsToSecondaryFunctions`.

---

This fixes the error discussed in
#8711 (comment).
@kripken kripken merged commit 072bcc4 into WebAssembly:main May 20, 2026
16 checks passed
@kripken kripken deleted the split.start branch May 20, 2026 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants